fix(runtime): use provider_opts.context_size for compaction by dgageot · Pull Request #2814 · docker/docker-agent

dgageot · 2026-05-18T10:05:01Z

Summary

When Docker Model Runner (DMR) is configured with a model that isn't catalogued in models.dev — typically a HuggingFace GGUF such as huggingface.co/unsloth/qwen3.5-4b-gguf:Q4_K_M — automatic compaction silently became a no-op:

* compactionContextLimit returned 0, so the LLM strategy bailed.
* The proactive 90% trigger in runStreamLoop never fired.
* Post-overflow recovery surfaced as Failed to get model definition every time the assistant tried to respond, with no way out.

The user already supplies a context_size in provider_opts for DMR to size the inference context. The runtime now uses that same value as the authoritative context limit when set, falling back to the models.dev catalogue otherwise. This keeps planning aligned with what the engine actually enforces.

Resolution order

1. provider_opts.context_size (when set and parseable as a positive integer)
2. models.dev catalogue limit
3. 0 (caller treats this as "can't compact")
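
The resolution order above can be sketched in Go as follows. This is a minimal illustration, not the code from this PR: the free-function signature, the `catalogueLookup` type, and `errNotFound` are hypothetical stand-ins for `LocalRuntime.resolveContextLimit` and the models.dev store, and for brevity only a plain `int` value of `context_size` is recognised.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical error standing in for a failed catalogue lookup.
var errNotFound = errors.New("model not found in catalogue")

// catalogueLookup is a hypothetical stand-in for the models.dev lookup.
type catalogueLookup func(modelID string) (int64, error)

// resolveContextLimit applies the resolution order:
//  1. provider_opts.context_size, when it is a positive integer
//  2. the catalogue limit, when the lookup succeeds with a positive value
//  3. 0, which callers treat as "can't compact"
func resolveContextLimit(providerOpts map[string]any, modelID string, lookup catalogueLookup) int64 {
	// Indexing a nil map is safe in Go and yields the zero value.
	if v, ok := providerOpts["context_size"].(int); ok && v > 0 {
		return int64(v)
	}
	if lookup != nil {
		if limit, err := lookup(modelID); err == nil && limit > 0 {
			return limit
		}
	}
	return 0
}

func main() {
	failing := func(string) (int64, error) { return 0, errNotFound }
	opts := map[string]any{"context_size": 8192}

	// The user-supplied value is used even though the catalogue lookup fails.
	fmt.Println(resolveContextLimit(opts, "uncatalogued-gguf", failing))
	// With no provider_opts and a failing lookup there is no usable limit.
	fmt.Println(resolveContextLimit(nil, "uncatalogued-gguf", failing))
}
```

Returning 0 rather than an error keeps every caller on the same simple check: a non-positive limit means compaction is skipped.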

A single LocalRuntime.resolveContextLimit helper is the source of truth, used by:

* compactionContextLimit (LLM compaction strategy)
* runStreamLoop proactive 90% trigger
* EmitStartupInfo sidebar context-percent on session restore
* compactWithReason post-compaction TokenUsageEvent

So the sidebar, the proactive trigger, and the LLM compactor all plan against the same number.
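
To illustrate why a limit of 0 disabled the proactive trigger, here is a small Go sketch of a 90% threshold check. The function name `shouldCompact`, the integer arithmetic, and the choice of `>=` over `>` are assumptions made for this example; the actual check in runStreamLoop may differ.

```go
package main

import "fmt"

// shouldCompact reports whether token usage has reached 90% of the context
// limit. A non-positive limit means the limit is unknown, so the trigger
// never fires; this mirrors the "0 means can't compact" convention.
// Hypothetical helper, not the function used in the PR.
func shouldCompact(usedTokens, contextLimit int64) bool {
	if contextLimit <= 0 {
		return false
	}
	// Integer form of usedTokens/contextLimit >= 0.9, avoiding float rounding.
	return usedTokens*10 >= contextLimit*9
}

func main() {
	fmt.Println(shouldCompact(7373, 8192)) // just above 90% of 8192
	fmt.Println(shouldCompact(7372, 8192)) // just below 90% of 8192
	fmt.Println(shouldCompact(7373, 0))    // unknown limit: never fires
}
```

Because every caller receives its limit from the same helper, a check like this and the sidebar percentage cannot disagree about the denominator.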

Tests

* 12-case helper matrix covering int / int64 / int32 / float64 / float32 / string-decimal / whitespace / non-numeric / negative / zero / bool / missing-key / nil-opts.
* Nil-provider safety.
* provider_opts.context_size takes precedence over the catalogue.
* Falls back to the catalogue when context_size is unset.
* Falls back to provider_opts.context_size when modelsStore.GetModel errors (the exact reported scenario).
* Returns 0 when neither source yields a usable limit.
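
The input types in the helper matrix suggest a parser along the following lines. This is a sketch under stated assumptions: the name `parseContextSize` is hypothetical, floats are truncated toward zero, and strings are trimmed and parsed as base-10 integers. The helper in the PR may treat fractional floats or other edge cases differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseContextSize extracts a positive context size from provider options.
// It returns 0 for a missing key, a nil map, an unsupported type, an
// unparseable string, or a non-positive value.
// Hypothetical helper written for illustration only.
func parseContextSize(opts map[string]any) int64 {
	raw, ok := opts["context_size"] // safe on a nil map
	if !ok {
		return 0
	}
	var n int64
	switch v := raw.(type) {
	case int:
		n = int64(v)
	case int32:
		n = int64(v)
	case int64:
		n = v
	case float32:
		n = int64(v) // assumption: truncate toward zero
	case float64:
		n = int64(v) // assumption: truncate toward zero
	case string:
		parsed, err := strconv.ParseInt(strings.TrimSpace(v), 10, 64)
		if err != nil {
			return 0
		}
		n = parsed
	default:
		// bool and any other type are not valid sizes.
		return 0
	}
	if n <= 0 {
		return 0
	}
	return n
}

func main() {
	fmt.Println(parseContextSize(map[string]any{"context_size": " 8192 "}))
	fmt.Println(parseContextSize(map[string]any{"context_size": true}))
	fmt.Println(parseContextSize(nil))
}
```

Config decoders commonly produce `int`, `float64`, or `string` for the same literal depending on the format, which is presumably why the matrix covers each of them.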

`task lint` — 0 issues. `task test` — full suite passes.

Closes #2800

Local models not catalogued in models.dev (e.g. DMR with HuggingFace GGUFs) can now supply context_size via provider_opts to enable compaction. When models.dev lookup fails, the runtime falls back to this user-supplied limit, making compaction (proactive threshold and post-overflow recovery) functional for uncatalogued models. Fixes docker#2800

Self-review of the previous commit surfaced four issues:

* compactIfNeeded carried an unused `*modelsdev.Model` parameter; drop it and let the call sites pass the resolved contextLimit only.
* EmitStartupInfo and compactWithReason did their own catalogue-only lookup, so the sidebar's context-percent and the post-compaction TokenUsageEvent stayed inconsistent with the freshly-fixed compaction triggers in loop.go and session_compaction.go.
* The provider_opts.context_size fallback was second-class. The user typed that number in their config, and DMR allocates exactly that much; treat it as authoritative when set, with the catalogue as fallback. This also makes the resolution monotonic across providers rather than depending on whether the catalogue has the model.
* The dual implementation of priority order (catalogue-first in runStreamLoop, provider-first elsewhere) was a footgun. Extract resolveContextLimit on LocalRuntime as the single source of truth. compactionContextLimit, runStreamLoop, EmitStartupInfo and compactWithReason now route through it, so the sidebar, the proactive trigger and the LLM compactor all plan against the same number.

docker-agent · 2026-05-18T10:35:56Z

❌ PR Review Failed — The review agent encountered an error and could not complete the review. View logs.

dgageot added 2 commits May 18, 2026 11:52

dgageot requested a review from a team as a code owner May 18, 2026 10:05

rumpl approved these changes May 18, 2026

dgageot merged commit cf296d8 into docker:main May 18, 2026
8 checks passed

BrewTestBot mentioned this pull request May 18, 2026

docker-agent 1.60.0 Homebrew/homebrew-core#283424

Merged

dgageot merged 2 commits into docker:main from dgageot:board/67262d46e6c8e609
